For: A TECHNIQUE FOR PROVIDING CONTINUOUS SPEECH RECOGNITION AS AN
ALTERNATE INPUT DEVICE TO LIMITED PROCESSING POWER DEVICES 



Pursuant to the Pre-Appeal Brief Conference Pilot Program, and further to the Examiner's Final
Office Action dated August 11, 2009, Applicant files this Pre-Appeal Brief Request for Review. This
Request is also accompanied by the filing of a Notice of Appeal. 

Applicant turns now to the rejections at issue: 

Claims 1-3, 5-16, 18-29, and 31-40 are rejected under 35 U.S.C. § 103(a) as being unpatentable 
over Hedin et al. (U.S. Patent No. 6,185,535, hereafter "Hedin") in view of King (U.S. Patent No. 
6,532,446) and D'hoore et al. (U.S. Patent No. 6,085,160, hereafter "D'hoore"). 

Applicant respectfully submits that there is no teaching or suggestion in the cited references of 
"transmitting the voice data and a device identifier to a computer", as recited in independent claim 1 and 
analogously recited in independent claims 14 and 27. 

The Examiner cites column 1, lines 21-34, column 4, lines 62-63, and column 5, lines 20-22 of
Hedin as allegedly inherently teaching this element of the claims. Specifically, the Examiner asserts that
because Hedin teaches using the Wireless Application Protocol (WAP) with the Wireless Markup Language
(WML), this teaching of Hedin "inherently" means that WAP includes device identifiers.

Applicant respectfully disagrees with the Examiner. 

Hedin merely teaches that WAP using WML enables terminals with small displays, limited 
processing power and low data transmission bandwidth to access and control services and content in a 
service network. The simple syntax in WML makes WAP suitable for controlling the service. The WML 
may be used in wireless mobile devices because its cards and scripts or libraries can be used to create 
applications that extend services available in mobile networks. 

However, there is no teaching or suggestion in Hedin that a device identifier is transmitted to the
computer as recited in the claims, nor would it be inherent that a device identifier is transmitted to the 
computer simply because Hedin teaches using a WAP standard. According to an exemplary embodiment 
of the present invention, the client device generates a speech packet that consists of voice data, data 
relating to a target application, and a client device identifier. Accordingly, the formatting of the translated 
text may be for a particular application and a particular client device (see for example, page 6 of the 
original specification). Hedin merely teaches that data may be communicated over a digital link 105 in 
the form of cards and scripts/libraries created by a standardized markup language such as WML. 
However, Hedin does not teach or suggest, inherently or explicitly, "transmitting the voice data and a 
device identifier to a computer", as claimed. 

2 Pages 4-5 of the Office Action dated August 11, 2009.

Secondly, Applicant submits that there is no teaching or suggestion in Hedin of "determining
whether to filter the translated text; and if it is determined that the translated text is to be filtered, applying
a filter to the translated text", as recited in independent claim 1 and analogously recited in independent
claims 14 and 27.

The Examiner cites column 5, lines 43-55 and column 6, lines 16-20 of Hedin as allegedly 
teaching these elements of the claims. However, Hedin teaches that a first digital link 105 connects a 
client part to a gateway/proxy part 107, and a second link 111 connects the gateway/proxy part to a server
109. The data that is communicated over the first link may comprise a different data format from the data 
that is communicated over the second link. If the formats are different, then some filtering may be done 
in order to eliminate data that cannot be received or processed by the user terminal. For example, if 
graphical information cannot be displayed on the terminal, then the graphical information is eliminated 
from the data that is being transmitted to the terminal, and only data that is appropriate (or data that can 
be processed by the terminal) is communicated to the terminal (see column 5, lines 43-55 of Hedin). 

Accordingly, the "filtering" taught by Hedin is used to prevent predetermined data from being
transmitted to a terminal, and is not used to filter voice data that has been translated to text.

Finally, Applicant submits that there is no teaching or suggestion in the cited references that "the 
voice data is translated to text using a voice print", as recited in claim 1 and analogously recited in claims 
14 and 27. 

The Examiner acknowledges that Hedin and King do not teach or suggest that "the voice data is 
translated to text using a voice print", as recited in the claims. The Examiner thus relies on D'hoore to 
remedy this deficiency. Applicant again disagrees with the Examiner's position. 

The Examiner continues to assert that the claimed element "the voice data is translated to text 
using a voice print", allegedly reads on the teaching by D'hoore that the speech recognition system is 
restricted to mapping speech onto language specific symbols (column 7, lines 32-38 of D'hoore), and that
voice prints can be used to recognize utterances of the trained word by the speaker (column 7, lines 49- 
51). The Examiner also appears to assert that the claimed "text" allegedly reads on the "symbols" as 
taught by D'hoore. Applicant respectfully disagrees with the Examiner. 

First, the "symbols" taught by D'hoore refer to sound, and not text as asserted by the
Examiner. D'hoore teaches a speech recognition system wherein a speech pre-processor receives input 
speech and produces a speech-related signal representative of the input speech (column 1, lines 41-47). 
Once the input speech signal has been pre-processed, a speech recognizer compares the speech signal to 
acoustic models in a phoneme (or symbol) database together with a language model. Universal,
language-independent phonemes (which represent the fundamental sounds of a given language) may be
constructed using a phonetic alphabet designed to cover all languages, which represents each sound by a
single symbol, wherein each symbol represents a single sound (column 4, lines 42-55). To provide
speech recognition in a given language, a set of symbols (phonemes) is defined which represents all
sounds of that language (column 1, lines 10-17). Accordingly, D'hoore clearly teaches that the
"symbols" relate to sound and not text, as asserted by the Examiner. Therefore, D'hoore does not teach
or suggest that voice data is translated to text. Instead, D'hoore teaches a speech recognition system
which produces a speech-related signal from inputted speech.

Further, Applicant respectfully submits that there is no teaching or suggestion in D'hoore that
"voice data is translated to text using a voice print", as claimed. D'hoore uses algorithms such as speaker 
dependent training of words to try to find the best possible phonetic representation for a particular word 
based on a few utterances of that word by the user. A word can be added to the recognizer by having the 
user pronounce the word a few times, and the system automatically constructs the best possible phoneme 
or model unit sequence to describe the word based on the uttered speech. This sequence is the voice print 
(column 7, lines 32-54). Accordingly, the voice print of D'hoore is used to recognize utterances of the
trained word by the speaker and to match the speech of the targeted speaker. D'hoore does not use the 
voice print to translate voice data to text, but uses the voice print to recognize words of different 
languages. 

The Examiner continues to read subject matter into D'hoore that is simply not taught or suggested 
by the reference. As discussed above, D'hoore describes a system which allows a speech recognition 
system to recognize different pronunciations of words in different languages. D'hoore's voice print is 
used to produce phoneme models of the sounds of words uttered by non-native speakers. D'hoore does 
not teach or suggest using the voice print to translate voice data into text as required by the claims. 

Accordingly, the cited references, alone or in combination, do not teach or suggest all of the 
features of the claims. Therefore, Applicant respectfully submits that claims 1, 14, and 27 should be 
allowable because the cited references do not teach or suggest all of the features of the claims. Claims 2, 
3, 5-13, 15, 16, 18-26, 28, 29, and 31-40 should also be allowable at least by virtue of their dependency
on independent claims 1, 14, and 27.

Respectfully submitted, 

/Mark E. Wallerson/ 

Mark E. Wallerson 
Registration No. 59,043 

Date: November 11, 2009